Compiling Boostexter Rules into a Finite-state Transducer

Author

  • Srinivas Bangalore
Abstract

A number of NLP tasks have been effectively modeled as classification tasks using a variety of classification techniques. Most of these tasks have been pursued in isolation, with the classifier assuming unambiguous input. For these techniques to be more broadly applicable, they need to be extended to apply to weighted packed representations of ambiguous input. One approach to achieving this is to represent the classification model as a weighted finite-state transducer (WFST). In this paper, we present a compilation procedure to convert the rules resulting from an AdaBoost classifier into a WFST. We validate the compilation technique by applying the resulting WFST to a call-routing application.
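The idea of turning boosted rules into a weighted automaton can be illustrated with a small sketch. It is not the paper's compilation procedure: it assumes a single class, rules that test only single words, and input in which no rule word occurs more than once. All names (Rule, score_direct, compile_to_acceptor, score_acceptor, RULES) and all weights are hypothetical. Under those assumptions, each rule's "absent" weight can be folded into an initial weight, and reading the rule's word adds the difference between its "present" and "absent" weights, so a one-state acceptor whose path weight is a sum reproduces the classifier score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical boosted rule for a single class."""
    word: str       # word whose presence the rule tests
    present: float  # weight contributed when the word occurs
    absent: float   # weight contributed when it does not


def score_direct(rules, words):
    """Score the input by testing each rule for presence or absence."""
    seen = set(words)
    return sum(r.present if r.word in seen else r.absent for r in rules)


def compile_to_acceptor(rules):
    """Build a one-state weighted acceptor as (initial_weight, arc_weights).

    Every rule starts out counted as 'absent'; the arc for its word
    swaps that contribution for the 'present' one.
    """
    initial = sum(r.absent for r in rules)
    arcs = {}
    for r in rules:
        arcs[r.word] = arcs.get(r.word, 0.0) + (r.present - r.absent)
    return initial, arcs


def score_acceptor(acceptor, words):
    """Sum the weights along the single path that reads the input.

    Assumption: no rule word is repeated in the input; a repeat would
    be counted twice here, unlike in score_direct.
    """
    initial, arcs = acceptor
    return initial + sum(arcs.get(w, 0.0) for w in words)


# Illustrative weights only; not taken from any trained model.
RULES = [
    Rule("bill", present=1.5, absent=-0.5),
    Rule("refund", present=2.0, absent=-0.25),
    Rule("hello", present=-1.0, absent=0.0),
]

if __name__ == "__main__":
    utterance = ["i", "want", "a", "refund"]
    print(score_direct(RULES, utterance))
    print(score_acceptor(compile_to_acceptor(RULES), utterance))
```

Because the acceptor's score is a sum over arcs, the same arc weights could in principle be applied to every path of a word lattice rather than to a single string, which is the motivation the abstract gives for a finite-state representation.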

Similar resources

On Finite-State Tonology with Autosegmental Representations

Building finite-state transducers from written autosegmental grammars of tonal languages involves compiling the rules into a notation provided by the finite-state tools. This work tests a simple, human-readable approach to compiling and debugging autosegmental rules using a simple string encoding for autosegmental representations. The proposal is based on brackets that mark the edges of the tone autos...

The OpenGrm open-source finite-state grammar software libraries

In this paper, we present a new collection of open-source software libraries that provides command line binary utilities and library classes and functions for compiling regular expression and context-sensitive rewrite rules into finite-state transducers, and for n-gram language modeling. The OpenGrm libraries use the OpenFst library to provide an efficient encoding of grammars and general algor...

An Efficient Compiler for Weighted Rewrite Rules

Context-dependent rewrite rules are used in many areas of natural language and speech processing. Work in computational phonology has demonstrated that, given certain conditions, such rewrite rules can be represented as finite-state transducers (FSTs). We describe a new algorithm for compiling rewrite rules into FSTs. We show the algorithm to be simpler and more efficient than existing algorith...

HFST - Framework for Compiling and Applying Morphologies

HFST–Helsinki Finite-State Technology (hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical inf...

A Method for Compiling Two-Level Rules with Multiple Contexts

A novel method is presented for compiling two-level rules which have multiple context parts. The same method can also be applied to the resolution of so-called right-arrow rule conflicts. The method makes use of the fact that one can efficiently compose sets of two-level rules with a lexicon transducer. By introducing variant characters and using simple pre-processing of multi-context rules, all...

Publication date: 2004